Fast Value Iteration for Goal-Directed Markov Decision Processes
Authors
Abstract
Planning problems where the effects of actions are non-deterministic can be modeled as Markov decision processes. Planning problems are usually goal-directed. This paper proposes several techniques for exploiting goal-directedness to accelerate value iteration, a standard algorithm for solving Markov decision processes. Empirical studies show that the techniques can bring about significant speedups.
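To make the baseline concrete, here is a minimal sketch of standard value iteration for a goal-directed (stochastic shortest path) MDP, i.e. the algorithm the paper's techniques accelerate. The tiny chain MDP, the function name `value_iteration`, and all parameter names are illustrative assumptions, not taken from the paper.

```python
def value_iteration(states, actions, transitions, cost, goal, eps=1e-6):
    """Repeat Bellman backups until the value function converges.

    transitions[s][a] -> list of (next_state, probability) pairs
    cost[s][a]        -> immediate cost of taking action a in state s
    Returns V, the expected cost-to-goal from each state.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s == goal:
                continue  # the goal is absorbing with zero cost
            # Bellman backup: best action under the current value estimate
            best = min(
                cost[s][a] + sum(p * V[s2] for s2, p in transitions[s][a])
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:  # stop when no state changed by more than eps
            return V

# Hypothetical 4-state chain, goal = 3: action "go" advances with
# probability 0.9 and stays put with probability 0.1, at unit cost.
states = [0, 1, 2, 3]
actions = ["go"]
transitions = {s: {"go": [(min(s + 1, 3), 0.9), (s, 0.1)]} for s in states}
cost = {s: {"go": 1.0} for s in states}

V = value_iteration(states, actions, transitions, cost, goal=3)
```

In this chain each step succeeds with probability 0.9, so the expected cost-to-goal from state 2 converges to 1/0.9 ≈ 1.11; goal-directed accelerations of the kind the paper studies exploit the fact that only states that can reach the goal need accurate backups.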
Similar resources
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
A fast point-based algorithm for POMDPs
We describe a point-based approximate value iteration algorithm for partially observable Markov decision processes. The algorithm performs value function updates ensuring that in each iteration the new value function is an upper bound to the previous value function, as estimated on a sampled set of belief points. A randomized belief-point selection scheme allows for fast update steps. Results i...
A Method for Speeding Up Value Iteration in Partially Observable Markov Decision Processes
We present a technique for speeding up the convergence of value iteration for partially observable Markov decision processes (POMDPs). The underlying idea is similar to that behind modified policy iteration for fully observable Markov decision processes (MDPs). The technique can be easily incorporated into any existing POMDP value iteration algorithms. Experiments have been conducted on ...
Symbolic Stochastic Focused Dynamic Programming with Decision Diagrams
We present a stochastic planner based on Markov decision processes (MDPs) that participated in the probabilistic planning track of the 2006 International Planning Competition. The planner transforms the PPDDL problems into factored MDPs that are then solved with a structured modified value iteration algorithm based on the safest stochastic path computation from the initial states to the goal st...
Publication date: 1997